/***************************************************************************************************
 Replication do-file (Africa) — Corruption and Living Conditions (Afrobarometer Round 8)

 Paper: "It’s Not Just the Economy, Stupid: Corruption and Subjective Well Being in Africa,
        and Latin America and the Caribbean"
		
 Purpose of this do-file:
   (1) Load Afrobarometer Round 8 data
   (2) Construct individual bribery indicators (five service contexts)
   (3) Aggregate bribery exposure to the regional level (REGION)
   (4) Construct living-standards outcome and controls
   (5) Estimate ordered probit models with clustered SEs and post-estimation effects

**************************************************************************************************/


/*** 0) Housekeeping and data load ***/
clear

* Load Afrobarometer Round 8 dataset (update path as needed)
use round8


/*** 1) Clustering identifier ***/
* Cluster standard errors at the COUNTRY × REGION level
* (i.e., all respondents within a subnational region in a country)
egen clustervar = group(COUNTRY REGION)


/***************************************************************************************************
 2) Bribery indicators (individual level) and regional incidence
    Afrobarometer bribery questions (Q44*) ask about paying a bribe in specific contexts.
    Coding implemented here:
      - bribe = 1 if Q44* in {1,2,3} (some frequency of bribe payment)
      - bribe = 0 if Q44* in {0,7}   (never / no experience)
      - missing otherwise (.)
***************************************************************************************************/

***** individual bribery indicators (regional incidence follows in 2.2)

* School-related bribe (Q44C)
gen schoolbribe = .
replace schoolbribe = 1 if inlist(Q44C, 1, 2, 3)
replace schoolbribe = 0 if inlist(Q44C, 0, 7)

* Medical-related bribe (Q44F)
gen medicalbribe = .
replace medicalbribe = 1 if inlist(Q44F, 1, 2, 3)
replace medicalbribe = 0 if inlist(Q44F, 0, 7)

* Documents/permits bribe (Q44I)
gen docbribe = .
replace docbribe = 1 if inlist(Q44I, 1, 2, 3)
replace docbribe = 0 if inlist(Q44I, 0, 7)

* Police bribe to obtain help (Q44L)
gen policebribehelp = .
replace policebribehelp = 1 if inlist(Q44L, 1, 2, 3)
replace policebribehelp = 0 if inlist(Q44L, 0, 7)

* Police bribe to avoid problems (Q44N)
gen policebribeprob = .
replace policebribeprob = 1 if inlist(Q44N, 1, 2, 3)
replace policebribeprob = 0 if inlist(Q44N, 0, 7)
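
* Optional cross-check (a sketch, not part of the original analysis): tabulate each raw
* Q44 item against its dummy to confirm that codes 1-3 map to 1, codes 0 and 7 map to 0,
* and every other code maps to missing. The local names q44items, dummies, q, and d are
* illustrative only.
local q44items Q44C Q44F Q44I Q44L Q44N
local dummies  schoolbribe medicalbribe docbribe policebribehelp policebribeprob
forvalues i = 1/5 {
    local q : word `i' of `q44items'
    local d : word `i' of `dummies'
    tabulate `q' `d', missing
}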


/*** 2.1) Overall bribery exposure measures ***/
* Total count of bribery contexts experienced (simple sum; missing propagates if any component missing)
gen bribetotal = schoolbribe + medicalbribe + docbribe + policebribehelp + policebribeprob

* Dummy: any bribery experienced in at least one context
* - coded 1 if bribetotal > 0
* - coded 0 if bribetotal == 0
* - missing if bribetotal missing
gen bribetotaldum = 0
replace bribetotaldum = 1 if bribetotal > 0
replace bribetotaldum = . if bribetotal == .
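
* Optional check (a sketch): because bribetotal is missing whenever any one of the five
* items is missing, bribetotaldum is missing for those respondents even if they reported
* a bribe in another context. The cross-tabulation shows how many cases this affects.
* n_bribe_miss is a hypothetical helper variable and is dropped afterwards.
egen n_bribe_miss = rowmiss(schoolbribe medicalbribe docbribe policebribehelp policebribeprob)
tabulate n_bribe_miss bribetotaldum, missing
assert missing(bribetotaldum) == (n_bribe_miss > 0)
drop n_bribe_miss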

/*** 2.2) Regional means (contextual corruption) ***/
* Mean of the “any bribery” dummy by REGION: proxy for regional bribery incidence
by REGION, sort : egen float regbribe = mean(bribetotaldum)

* Region-level means for each bribery context (useful for context-specific models)
by REGION, sort : egen float regschoolbribe = mean(schoolbribe)
by REGION, sort : egen float regmedicalbribe = mean(medicalbribe)
by REGION, sort : egen float regdocbribe    = mean(docbribe)
by REGION, sort : egen float reghelpbribe   = mean(policebribehelp)
by REGION, sort : egen float regprobbribe   = mean(policebribeprob)
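
* Optional check (a sketch): the means above group on REGION alone, while clustervar
* groups on COUNTRY and REGION. The two coincide only if no REGION code is shared by
* more than one country, in which case the count below is 0. If it is positive, grouping
* on COUNTRY REGION instead would keep regions in different countries apart.
* n_countries is a hypothetical helper that exists only inside the preserve/restore block.
preserve
contract COUNTRY REGION
bysort REGION: gen n_countries = _N
count if n_countries > 1
restore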


/***************************************************************************************************
 3) Outcome variable: living conditions (higher = better)
***************************************************************************************************/

**** living conditions (bigger is better)

* Living standards outcome from Q4B; treat special codes as missing
gen standards = Q4B
replace standards = . if inlist(Q4B, 8, 9, -1)
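
* Optional check (a sketch): the post-estimation calls in section 6 request outcome(1)
* through outcome(5), which refer to the outcome values 1 to 5, so standards should take
* only those values. The count below should be 0.
tabulate Q4B standards, missing
count if !inlist(standards, 1, 2, 3, 4, 5) & !missing(standards)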


/***************************************************************************************************
 4) Controls
    This section constructs individual-level covariates and regional poverty.
***************************************************************************************************/

**** controls

* Gender of the respondent (female = 1, male = 0)
gen female = .
replace female = 1 if Q101 == 2
replace female = 0 if Q101 == 1

* Urban (urban = 1, rural = 0)
* Note: only URBRUR == 2 is coded 0, so every other code, including missing, is coded 1
gen urban = 1
replace urban = 0 if URBRUR == 2


/*** 4.1) Age ***/
* Rename Q1 to age and set special codes to missing
rename Q1 age
replace age = . if age == 999
replace age = . if age == 998

* Quadratic term to allow nonlinear age relationship
gen ageSQ = age*age


/*** 4.2) Poverty index (average scarcity across five items) ***/
* Each item is cleaned by setting special codes to missing, then averaged

rename Q7A food
replace food = . if food == -1
replace food = . if food == 9
replace food = . if food == 998
replace food = . if food == 8

rename Q7B water
replace water = . if water == -1
replace water = . if water == 9
replace water = . if water == 998
replace water = . if water == 8

rename Q7C medical
replace medical = . if medical == -1
replace medical = . if medical == 9
replace medical = . if medical == 998
replace medical = . if medical == 8

rename Q7D cooking
replace cooking = . if cooking == -1
replace cooking = . if cooking == 9
replace cooking = . if cooking == 998
replace cooking = . if cooking == 8

rename Q7E cash
replace cash = . if cash == -1
replace cash = . if cash == 9
replace cash = . if cash == 998
replace cash = . if cash == 8

* Poverty index: mean of the five items (higher = more deprivation, if the Q7 scale runs
* from never to always); missing if any one item is missing
gen poverty = (food + water + medical + cooking + cash)/5

* Regional poverty: average poverty by REGION (contextual economic conditions)
by REGION, sort : egen float regpoverty = mean(poverty)
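
* Optional check (a sketch): confirm that no special codes survive in the five items and
* see how many respondents lose the index because at least one item is missing.
foreach v of varlist food water medical cooking cash {
    count if inlist(`v', -1, 8, 9, 998)
}
count if missing(poverty)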


/*** 4.3) Education (collapsed categorical scale + labels) ***/
rename Q97 education
replace education = . if education == 99
replace education = . if education == -1
replace education = . if education == 98

* Collapse raw categories into five groups (single recode, so no value is remapped twice)
recode education (0/2 = 1) (3 4 = 2) (5 = 3) (6 7 = 4) (8 9 = 5)

* Apply value labels for readability
label define educa ///
  1 "Less than full primary" ///
  2 "Primary or Some Secondary" ///
  3 "Secondary" ///
  4 "Post Secondary Qualification or Some Uni." ///
  5 "Uni. Complete & Postgrad"
label values education educa


/*** 4.4) Religion (categorical control) ***/
gen religion = RELIG_COND
replace religion = . if RELIG_COND == 9


/***************************************************************************************************
 5) Missingness diagnostics and analytic sample check
***************************************************************************************************/

* Create missingness indicators (miss_*); misstable generates one only for variables that have missing values
misstable summarize standards bribetotaldum regbribe age ageSQ female education poverty regpoverty urban religion COUNTRY, gen(miss_)

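* Descriptive stats on the sample with non-missing outcome, bribery dummy, age, education, poverty, and religion
* (the condition does not filter on regbribe, female, regpoverty, or urban)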
summarize standards bribetotaldum regbribe age female education poverty regpoverty urban religion ///
  if miss_standards==0 & miss_bribetotaldum==0 & miss_age==0 & miss_ageSQ==0 & ///
     miss_education==0 & miss_poverty==0 & miss_religion==0
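
* Alternative complete-case flag (a sketch): rowmiss() counts missing values across every
* variable in the core model, including those not filtered above, and does not depend on
* which miss_* indicators misstable happened to create. n_model_miss and complete_case are
* hypothetical names; ageSQ is left out because it is missing exactly when age is.
egen n_model_miss = rowmiss(standards bribetotaldum regbribe age female education ///
    poverty regpoverty urban religion COUNTRY)
gen byte complete_case = (n_model_miss == 0)
tabulate complete_case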


/***************************************************************************************************
 6) Main results: Ordered probit models + marginal effects
    Outcome: standards (ordinal)
    Key predictors:
      - bribetotaldum: individual experience with bribery
      - regbribe: regional bribery incidence (context)
    Controls: age, age^2, gender, education, poverty, regpoverty, urban, religion, country FE
    SEs clustered at country×region (clustervar)
***************************************************************************************************/

**** Results

/*** 6.1) Overall sample ***/
xi: oprobit standards bribetotaldum regbribe age ageSQ female i.education poverty regpoverty urban i.religion i.COUNTRY, cluster(clustervar)

* Average marginal effects of each covariate on the probability of each outcome category (1-5)
margins, dydx(*) predict(outcome(1))
margins, dydx(*) predict(outcome(2))
margins, dydx(*) predict(outcome(3))
margins, dydx(*) predict(outcome(4))
margins, dydx(*) predict(outcome(5))

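* Change in predicted probabilities associated with regbribe (at means), with CIs
* mchange is user-written (SPost13 package by Long and Freese) and must be installed first
mchange regbribe, atmeans stats(ci) centered

    ** living in a more corrupt region is associated with worse reported living standards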
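
* Optional alternative (a sketch, not the specification reported above): because ageSQ
* enters as a separate variable, margins treats it as unrelated to age. Writing the
* quadratic as c.age##c.age lets margins account for both terms in the age effect.
* Assumption: education, religion, and COUNTRY take nonnegative integer values, which
* factor-variable notation requires. The coefficients should match section 6.1.
oprobit standards bribetotaldum regbribe c.age##c.age female i.education poverty ///
    regpoverty urban i.religion i.COUNTRY, vce(cluster clustervar)
margins, dydx(age regbribe) predict(outcome(5))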


/*** 6.2) Women subsample ***/
xi: oprobit standards bribetotaldum regbribe age ageSQ i.education poverty regpoverty urban i.religion i.COUNTRY if female == 1, cluster(clustervar)

* Marginal effects by outcome for women
* (mfx evaluates at covariate means; margins supersedes it from Stata 11 onward)
mfx compute, predict(outcome(1))
mfx compute, predict(outcome(2))
mfx compute, predict(outcome(3))
mfx compute, predict(outcome(4))
mfx compute, predict(outcome(5))


/*** 6.3) Men subsample ***/
xi: oprobit standards bribetotaldum regbribe age ageSQ i.education poverty regpoverty urban i.religion i.COUNTRY if female == 0, cluster(clustervar)

* Marginal effects by outcome for men
mfx compute, predict(outcome(1))
mfx compute, predict(outcome(2))
mfx compute, predict(outcome(3))
mfx compute, predict(outcome(4))
mfx compute, predict(outcome(5))

    ** Little evidence that the effect differs between men and women
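
* Optional formal check (a sketch, not in the original): pool both genders, interact
* female with regbribe, and test the interaction term. Unlike the split samples, this
* lets only the regbribe coefficient differ by gender. Assumption: female, education,
* religion, and COUNTRY hold nonnegative integers, as factor-variable notation requires.
oprobit standards bribetotaldum i.female##c.regbribe age ageSQ i.education poverty ///
    regpoverty urban i.religion i.COUNTRY, vce(cluster clustervar)
test 1.female#c.regbribe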


/***************************************************************************************************
 7) Which bribery context matters most?
    Replace regbribe with each context-specific regional bribery rate.
    The last model includes all context measures simultaneously.
***************************************************************************************************/

*** what context is worst?
xi: oprobit standards bribetotaldum regschoolbribe age ageSQ female i.education poverty regpoverty urban i.religion i.COUNTRY, cluster(clustervar)
xi: oprobit standards bribetotaldum regmedicalbribe age ageSQ female i.education poverty regpoverty urban i.religion i.COUNTRY, cluster(clustervar)
xi: oprobit standards bribetotaldum regdocbribe    age ageSQ female i.education poverty regpoverty urban i.religion i.COUNTRY, cluster(clustervar)
xi: oprobit standards bribetotaldum reghelpbribe   age ageSQ female i.education poverty regpoverty urban i.religion i.COUNTRY, cluster(clustervar)
xi: oprobit standards bribetotaldum regprobbribe   age ageSQ female i.education poverty regpoverty urban i.religion i.COUNTRY, cluster(clustervar)

* All contexts included jointly (compare coefficients/marginal effects across contexts)
xi: oprobit standards bribetotaldum regschoolbribe regmedicalbribe regdocbribe reghelpbribe regprobbribe ///
  age ageSQ female i.education poverty regpoverty urban i.religion i.COUNTRY, cluster(clustervar)
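
* Optional follow-up (a sketch): after the joint model, a Wald test of equal coefficients
* across the five regional context measures, followed by their pairwise correlations,
* which indicate how separately the contexts can be identified.
test regschoolbribe = regmedicalbribe = regdocbribe = reghelpbribe = regprobbribe
correlate regschoolbribe regmedicalbribe regdocbribe reghelpbribe regprobbribe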

    *** Interpretation note (author): corrupt "officialdom" is bad for living standards.
    *** School and medical sector corruption do not hurt CURRENT living conditions on average.
    *** A transactional police force is good for living conditions (hypothesis: pay-for-service).


/***************************************************************************************************
 8) Do context effects differ by gender?
    Re-estimate the full-context model separately for women and men.
***************************************************************************************************/

*** are different contexts worse for men v women?
* female is left out of the regressors because it is constant within each subsample
xi: oprobit standards bribetotaldum regschoolbribe regmedicalbribe regdocbribe reghelpbribe regprobbribe ///
  age ageSQ i.education poverty regpoverty urban i.religion i.COUNTRY if female == 1, cluster(clustervar)

xi: oprobit standards bribetotaldum regschoolbribe regmedicalbribe regdocbribe reghelpbribe regprobbribe ///
  age ageSQ i.education poverty regpoverty urban i.religion i.COUNTRY if female == 0, cluster(clustervar)

*** again, some differences in magnitude, but men and women both suffer and gain from the same corruption contexts
